ASSESSMENT DRIVES STUDENT LEARNING: EVIDENCE FOR SUMMATIVE 
ASSESSMENT FROM PAKISTAN 


Abstract: Research studies from various parts of the world indicate that university 
students find research methodology courses among the most difficult subjects to 
grasp. Students in Pakistan display similar attitudes towards learning research. 
Those of us who teach research at institutions of higher learning in Pakistan 
continuously hear students describe research as one of the most ‘difficult and dry’ 
subjects they have to study. Hence, we as teachers of research at these institutions 
keep looking for ways to increase students’ interest and academic achievement in 
research. In that spirit, we designed two assessment tasks for a research methodology 
course at Master’s level and used them to assess the difference in learning. For one of 
the assignments, assessment was summative, with a final official grade at the end of the 
semester, while the other was put through formative assessment and no official final 
grade was assigned to it. The results of our study reinforce the centrality of assessment 
to the learning of students and indicate that students put more effort into learning a 
task that carries a final grade. Although our results do not support the effectiveness of 
formative assessment, we have raised concerns about the ‘dialogical’ nature of 
feedback to students. Despite the fact that the context of our study is Pakistan, the 
implications of our research may be discussed in the larger context of teaching and 
learning of research, especially in contexts where the medium of instruction is not the 
local or national language. 
Introduction 

Teaching and learning of research in the institutions of higher learning, by and large, is 
considered difficult by both teachers and students (Breuer and Schreier 2007; Rushing and 
Winfield 1999; Takata and Leiting 1987; Wiggins and Burns 2009). In Pakistan we constantly hear 
students complaining about research courses as the most ‘difficult and dry’ subjects they have 
to study (Qureshi 2013; Vazir and Qureshi 2011). Even as researchers are worried about the 
general state of teaching in higher education institutions (Ali 2008; Hoodbhoy 2009; Khan 
2005; Raza, Naqui and Lodhi 2011), teachers like us are particularly troubled by our students’ 
poor performance in research courses (Qureshi 2013). For our part, to engage students in 
research methods courses, we have been experimenting with various teaching strategies 
(Vazir and Qureshi 2011). Like Rushing and Winfield (1999) and Takata and 
Leiting (1987), our main focus was on the ‘learning by doing’ approach to teaching research until 
Wood’s (2009) work drew our attention toward the use of assessment strategies for 
increasing students’ understanding of research-related concepts. Wood (2009:5) believes 
that “Assessment not only drives learning it may also help learning”. While still looking for 
ways to better ‘teach’ research, we also started exploring effective strategies of assessment 
for gauging students’ learning (and our teaching) of particular concepts in research methods 
courses. What follows is the description of one such experiment, carried out in one of 
the institutions of higher education in Pakistan. The purpose was to test whether assessment 
really drives learning in higher education institutions, as Wood (2009) has suggested. 

Conceptual context of the study 

Research in education emphasizes the function of assessment as the driver of student 
learning (Al-Kadri 2013; Cowan and Cherry 2012; Garrison and Ehringhaus 2007; Guskey 2003; 
Stiggins 2002). By and large there is also agreement on the purposes of assessment: 
assessment for learning (formative) and assessment of learning (summative). However, no 
consensus has been reached on the general effects of either type of assessment on students’ 
learning. Some researchers advocate formative assessment because they believe it contributes to 
classroom learning (Black et al. 2006; Brown 2005; Dylan and Thompson 2008; 
Narciss 2004; Norman et al. 2010; Rushton 2005, among others). These researchers 
oppose summative assessment on the grounds that it concentrates on end-of-term or end-of-unit 
tests that may not reflect students’ understanding of the concepts. While Rust (2011) opposes 
summative strategies because he thinks institutions “(mis)use numbers to judge and record 
students’ assessment” (p.1), Raupach et al. (2013) and Rodiger and Karpicke (2006) are more 
optimistic about the positive role of summative strategies. Torrance (2007) and Harlen (2005) 
see assessment as learning, and not of or for learning per se, thus promoting a blend of the two, 
whereas Knight (2002) is skeptical of both formative and summative assessment practices at 
the higher education level. 

In Pakistan, the emphasis has traditionally been on summative assessment, from schools through 
higher education institutions, with annual examinations at the end of each academic year 
(Ahmed et al. 2013). In the 1970s universities started moving away from the annual to a 
semester-based system of schooling (Iqbal 2013). One of the mandates of the Higher Education 
Commission of Pakistan, established in 2002, is to promote a mix of formative and summative 
strategies of assessment (HEC website). At the same time, the computerized National Admission Test, 
the Graduate Assessment Tests and the establishment of the National Testing Service (NTS) in 2002 
reflect an inclination toward summative assessment. The majority of educational institutions still 
allocate a larger share to summative than to formative assessment of student achievement, but 
very little research has been carried out to analyze the effectiveness of assessment practices 
in Pakistani institutions of higher learning (Ahmed et al. 2013; Buzdor et al. 2013; Hussain 2010; 
Iqbal 2004; Naeemullah 2007; Rehmani 2003; Shah 2002; Shirazi 2004; Urooj and Ahmed 2012). 
Rarely has any researcher in Pakistan contested the notion of assessment driving learning or 
compared which assessment practices work better. The need for such 
research therefore cannot be overemphasized. Hence, like Raupach et al. (2013), we decided to 
check whether assessment really drives learning in higher education institutions and, if so, which of 
the two types, summative or formative, is more at work. For this purpose we conducted a 
small-scale correlational research study in an institution of higher learning in Pakistan. The 
purpose was to explore the relationship with the help of two interrelated research questions: 

1. To what extent does assessment drive learning? 

2. What is the relationship between formative assessment and student learning? 

Physical context of the study 

One institution of higher learning in Pakistan that offers a Master’s program in Education 
was selected for this study. Two intact groups of students (28 and 32), enrolled in two classes of 
Research Methods in the spring and fall semesters, were chosen as a convenience sample for the 
study. Both groups were taught by the same instructor and were exposed to similar content 
and teaching strategies. The total sample size was 60. As the focus of the study is the differences 
created by assessment practices, group comparisons per se are not included. 

Theoretical context of the study 

Based on the existing (and expanding) body of research literature in the field plus our 
observations and experiences of assessment practices in the institutions of higher education 
in Pakistan, we theorize that students put more effort into understanding and completing 
the tasks which are subjected to summative assessment. Students receive marks (and final 
grades) for such tasks; these grades are officially recorded and reflect ‘more’ of learning. At 
the same time, we also believe that students by and large are “conscientious consumer[s]” 
(Higgins, Hartley and Skelton 2002, p.53); therefore, we expect that students pay more 
attention “and seek feedback which will help them to engage in their subject in a ‘deep’ way” 
(Higgins, Hartley and Skelton 2002:53). However, there is a caveat: students are more likely to 
translate feedback effectively into their assignments for summative assessment; hence, the 
impact of feedback will be more visible in a ‘before’ and ‘after’ feedback comparison of tasks. 
But if we heed Gibbs and Simpson’s (2003) warning that many students do not read and use 
feedback for improving their understanding of the topic, then there would be little or no 
difference between the scores of summative and formative assignments. The 
present research was conducted in order to test this theory. 

Study design and Procedure 

The purpose of our research was to compare two types of assessment practices, i.e., 
formative and summative, but we did not separate the two classes into formative and 
summative groups because a) it would have been unethical, and b) our purpose was not to 
compare groups per se but assessment practices, which went beyond group boundaries. 
Hence, for this study every student went through the experience of both types of assessment: 
if summative assessment was applied to the review article for the spring semester, formative 
assessment was applied for the fall class and the marks were not included in the final grade 
for the fall semester. In the same way, questionnaire administration was formally assessed for the 
fall class but was given as a home task to the spring class, where the assessment was formative. The same 
was the case for the corresponding quizzes. Table 1 below presents the implementation scheme. 


Table 1: Administration schema for assessment 

                         Spring 2011                       Fall 2011
Summative assessment     Review article plus               Questionnaire plus
                         corresponding quiz                corresponding quiz
Formative assessment     Questionnaire as home task;       Review article as home task;
                         corresponding quiz, taken         corresponding quiz, taken
                         in class, formative assessment    in class, formative assessment


Tools for assessment 

Two written assignments (worth 30 points each) and two quizzes (worth 20 points each) 
were administered for the purposes of assessing the relationship between assessment and 
learning. One assignment was to produce a review article on a topic relevant to course material. 
The purpose of the assignment was to expose students to the state of research in their 
selected area and sharpen their analytical skills through summary, synthesis, comparison and 
evaluation of the research literature. For the other assignment, students administered 
questionnaires in actual classrooms and held post-administration focus group discussions with 
a small sample of their students on item clarity, relevance and comprehension. The purpose of 
the assignment was to enhance students’ learning about the use of questionnaires as a 
research tool for collecting teaching and learning related data. All students were required to 
make in-class presentations and receive feedback for future improvement. The week 
following the presentations, students were given quizzes. Final results were compared at the 
end of semesters. 

The summative assessment for this study was done in line with the ‘assessment of learning.’ 
For the students enrolled in the spring semester it was measured as ‘the final grade obtained by a 
student at the end of the semester for writing the Review article assignment and its 
corresponding quiz’. Similarly, for the students enrolled in the fall semester it was ‘the final grade 
obtained by a student at the end of the semester for the write-up of the Questionnaire activity 
and its corresponding quiz’. 

The formative assessment, on the other hand, is not that straightforward (see Capraro et al. 
2011 for a review of operational definitions of the types). One practice of formative 
assessment on which the majority of educators and researchers agree is the role of ‘feedback’ in the 
learning of students (Hattie and Timperley 2007; Shute 2007). For the present study, written 
feedback was used as a formative assessment practice ‘for learning.’ The feedback was given 
to all students for both assignments and quizzes. They also received numeric marks for the 
current state of their work. The only difference was that the numeric marks students received 
for the Questionnaire activity and its corresponding quiz in spring, and for the Review article 
assignment and its corresponding quiz in fall, were considered 
‘formative’ grades and did not become part of their ‘official’ transcript. 
To assess the learning from formative assessment, students’ scores for quizzes were compared 
separately, as all students received general instructions in class as well as specific feedback for 
further development. Therefore, other things being equal, there should be a positive 
relationship between feedback and student performance. Our position was that quizzes gave 
students a second chance to further improve their assignments. Hence we expected that 
students with high scores on the summative assignment would also have high scores on the quiz ‘with’ 
summative assessment because, for our students, the “feedback and directed feed forward 
were linked into the next submission”, just as for the students of Parry, Larsen and Walsh 
(2008:1). 

For the study, we tested three hypotheses: 

• Hypothesis 1: Students’ scores will be higher on summative tasks and lower on 
tasks with formative assessment. 

o Correlation between summative assignment and formative assignment will 
be negative. 

o Correlation between summative quiz and formative assignment will be 
negative. 

• Hypothesis 2: Students with high scores on the summative assignment will also have 
high scores on the quiz ‘with’ summative assessment. 

o Correlation between summative assignment and summative quiz will be 
positive. 

• Hypothesis 3: Students’ scores on the quiz with summative assessment will not be 
related to their scores on the quiz with informal assessment. 

o Correlation between summative quiz and informal quiz will be zero. 

Results 

Table 2 (below) displays the descriptive statistics for the data. The four important values 
reported in the table provide an overview of the marks distribution for both groups. 


Table 2: Descriptive statistics 

Group 1 (N = 60)

                   Summative assessment       Formative assessment
Descriptive        Assignment     Quiz        Assignment     Quiz
Mean               21.12          5.72        3.52           3.03
Median*            22.00          6.00        3.00           3.00
Std. Deviation     5.14           1.28        3.12           1.63
Skewness           -.72           -.96        2.96           .41
Kurtosis           -.31           .55         12.54          -1.006
Minimum            8.00           2           1.00           1
Maximum            28.00          7           20.00          6

*Field (2013) recommends reporting the median along with the mean for skewed distributions. 


Students generally scored higher on tasks which were formally assessed; mean values are 
higher for ‘with assessment’ tasks. The value of kurtosis is < 3 for all tasks with one exception 
(the large positive value of 12.54 > 3 indicates a leptokurtic distribution with a higher peak and fatter tails). 
The values of < 3 indicate a platykurtic distribution, which is flatter, with a wider peak and lower 
probability of extreme values than a normal distribution, as the values are more widely spread 
around the respective means. The values of the SDs also confirm that the data set is more spread 
out, besides being negatively skewed, for ‘with summative assessment’ tasks. When we 
compare these results with the positively skewed distributions of scores on assignments and 
quizzes ‘with formative assessment,’ it can be concluded that students generally scored 
poorly on assignments with formative assessment. This also means that the difference 
between the positively skewed distributions for tasks assessed with the formative strategy and 
the negatively skewed distributions for tasks assessed with a summative grade can be taken as 
an indicator of how much students learned while working on the assignments for their 
final grades. This lends support to our first hypothesis. 
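
For readers who wish to reproduce this kind of shape analysis, the following is a minimal sketch of how skewness and kurtosis can be computed from raw marks. It is an illustration only: the function names and the marks are hypothetical and are not the study's data, and the formulas are the simple population-moment versions, which may differ slightly from the bias-corrected values that statistical packages report. Note also that two conventions for kurtosis are in use: one in which a normal distribution scores 3, and 'excess' kurtosis in which it scores 0. The source does not state which convention Table 2 follows, so the cut-off should be checked against the software used.

```python
from statistics import mean, median, pstdev

def central_moment(data, k):
    """k-th central moment using the population (divide-by-n) definition."""
    m = mean(data)
    return sum((x - m) ** k for x in data) / len(data)

def skewness(data):
    """Moment-based skewness: negative = longer left tail, positive = longer right tail."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Moment-based kurtosis minus 3, so that a normal distribution scores 0."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

# Hypothetical marks invented for illustration (NOT the study's data):
# most scores are low and one is very high, giving a long right tail.
marks = [1, 1, 2, 2, 3, 3, 3, 4, 5, 20]

print("mean    ", round(mean(marks), 2))
print("median  ", median(marks))
print("std dev ", round(pstdev(marks), 2))
print("skewness", round(skewness(marks), 2))
print("kurtosis", round(excess_kurtosis(marks), 2))
```

With a single extreme mark in an otherwise low-scoring set, both statistics come out positive, which is the pattern reported for the formatively assessed assignment.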

As the shape and spread statistics of the marks distributions above, i.e., leptokurtic or 
platykurtic and skewed distributions, indicate that the data set falls short of meeting some 
of the basic requirements of a normal distribution, we also examined our data 
graphically. Using modal frequency analysis, one can see that the mode for the 
assignment with summative assessment is much higher (Plot A, Mode = 26) than that for the 
assignment with formative assessment (Plot B, Mode = 2), where a large majority of the scores is 
concentrated toward the lower end of the distribution, showing that students generally 
performed poorly on tasks that were informally assessed. The same pattern is visible in the 
scores on quizzes (Plot C, Mode = 6 and Plot D, Mode = 2 respectively). 
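
A modal frequency analysis of this kind can be sketched in a few lines. The example below is an assumption-laden illustration rather than the procedure actually used for the plots: the helper names and the marks are hypothetical, and the text bar chart merely stands in for a plotted frequency distribution.

```python
from collections import Counter

def frequency_table(marks):
    """Return (mark, count) pairs sorted by mark: the bars of a frequency plot."""
    return sorted(Counter(marks).items())

def modes(marks):
    """Return every mark that occurs with the highest frequency."""
    counts = Counter(marks)
    top = max(counts.values())
    return sorted(m for m, c in counts.items() if c == top)

# Hypothetical marks invented for illustration (NOT the study's data).
formative_marks = [2, 2, 2, 3, 3, 4, 1, 2, 5, 3]

# One row per distinct mark, with one '#' per student who obtained it.
for mark, count in frequency_table(formative_marks):
    print(f"{mark:>2} | {'#' * count}")
print("mode(s):", modes(formative_marks))
```

Returning all tied modes, rather than a single value, avoids hiding a bimodal distribution behind one number.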


[Plots A to D: frequency distributions of marks. Plot A: summative assignment; Plot B: formative assignment; Plot C: summative quiz; Plot D: formative quiz.] 

In order to test the strength and direction of the relationship between assessment and 
learning, we chose Spearman’s rho, which is the non-parametric counterpart of the Pearson 
correlation coefficient. Our data set met both assumptions for using this coefficient: a) 
our variables are measured at interval level, i.e., assignment and quiz marks are 0-30 and 0-10 
respectively, and b) the relationship between the two variables is monotonic (Field 2013). The 
Spearman’s rho test revealed a moderate negative correlation between the scores of the 
Summative and Formative assignments, which was statistically significant (rs[60] = -.308, p < 
.01). A moderate negative correlation was also found between the scores of the Summative quiz 
and the Formative assignment. The relationship was statistically significant (rs[60] = -.405, p < 
.01). These results are in the expected direction and lend support to our first hypothesis that 
students performed better on tests for summative assessment. Based on these results we can 
also claim, like Raupach et al. (2013), that summative assessment is a “more powerful driver of 
student learning” (p.1), as students generally put more effort into learning for formal assessment. 
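
The coefficient reported above can be illustrated with a short, self-contained sketch. It assumes the standard definition of Spearman's rho as the Pearson correlation of the ranked scores, with tied scores given their average rank; the function names and the six pairs of marks are hypothetical and are not the study's data, and no significance test is included.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical marks for six students (NOT the study's data): students who
# score high on the graded task tend to score low on the ungraded one.
summative = [26, 22, 18, 25, 12, 20]
formative = [2, 3, 6, 1, 9, 4]

print(round(spearman_rho(summative, formative), 3))
```

Because only the ranks enter the calculation, the coefficient is insensitive to the skewed and heavy-tailed mark distributions described earlier, which is the usual reason for preferring it to the Pearson coefficient with data of this kind.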

The correlation test also indicated that the students’ scores for the Summative quiz were weakly 
and negatively related to their scores on the Summative assignment (rs[60] = -.118, p > .05). This 
relationship was neither in the expected direction (it should have been positive) nor 
statistically significant. Therefore, though conceptually a positive relationship between 
effective utilization of feedback and improvement of the summative assignment makes sense, the 
empirical data do not support the thesis; hence we fail to reject the null hypothesis of no 
relation between feedback and further improvement. 

Spearman’s rho further indicated that the students’ scores on the Summative quiz were unrelated 
to their scores on the Formative quiz (rs[60] = -.026, p > .05). The relationship, though in the 
expected direction (with negative sign), was not statistically significant. Therefore, our third 
hypothesis is not proven. 


Discussion of the findings 

Assessment of teaching and learning of research methodology, like that of any other academic 
discipline, is a complex process. But unlike most other disciplines, not much research has been 
done in this area (see Wagner, Garner and Kawulich 2011 for a review). The results of our study 
corroborate the existing evidence for assessment, especially summative, being a force behind 
student learning. Although the substantiation is mainly based on western experiences, our 
study, by validating these findings, raises an important question about the assessment context 
and its impact on student learning. In Pakistan, the learning environment has long been 
dominated by summative assessment at the end of the academic year. A large majority of the 
students coming to the institutions of higher learning have more experience of summative 
than formative assessment (Qureshi 2013). Moreover, a significant number of educational 
institutions still practice summative assessment despite the fact that the Higher Education 
Commission of Pakistan has provided standard guidelines to ensure that all colleges and 
universities use more formative assessment strategies in order to enhance student 
achievement. Against this backdrop the results of our study are not surprising, but one of the 
‘non-significant’ relationships, i.e., the unrelated scores of quizzes, has implications for 
research and policy for higher education. In Pakistan there are two commonly held 
(mis)perceptions among policy makers: a) summative assessment is the root cause 
of the declining standards of education as it affects every aspect of teaching and learning in 
classrooms (Rehmani 2003), and b) the semester system is equal to formative assessment. To the 
best of our knowledge, no research-based evidence is available in Pakistan to show whether 
formative assessment increases or has no effect on students’ learning. At the same time, 
the push for formative assessment, especially by HEC, explicitly or implicitly gives the 
impression that formative assessment is better, despite there being no consensus in the 
literature on the impact of formative and/or summative assessment strategies on students’ learning. 

Furthermore, research on formative assessment itself is inconclusive as well as conditional; for 
instance, on one hand Yorke (2001) leans toward formative assessment because it is 
“dialogic” (p.117); students receive feedback and thus get a second chance for improving their 
academic performance. Similarly, Dunn and Mulvenon (2009) while favoring formative 
assessment also draw our attention toward the limited impact of formative assessment on 
student learning. On the other hand, while not negating the potential of feedback for 
improving students’ formally assessed assignments, Crisp (2007), Duncan (2007) and Sadler 
(2010) caution educators that students do not pay much attention to instructors’ feedback, 
which is considered to be the hallmark of formative assessment; therefore, feedback (or 
formative assessment by implication) may not always be as effective as educationists think 
and policy makers seem to believe. 

While we may be in agreement with the analyses of the researchers mentioned 
above, we refrain from generalizing the findings of our study beyond the study 
sample in view of its limitations: our study is based on a convenience sample of intact groups 
which may or may not be representative of the student universe in Pakistan. However, the 
Higher Education Commission of Pakistan has a mandate to promote research and 
research culture in Pakistani institutions of higher learning, which makes us one of its 
stakeholders. Therefore, we recommend that HEC plan on commissioning a series of 
research studies in the area of assessment of teaching and learning, especially of research. 
Such an endeavor will increase awareness of multiple assessment techniques by providing 
empirical evidence about their effectiveness (or otherwise). At the same time, it will also 
encourage collaborative research work on more effective assessment techniques for the 
institutions of higher learning in Pakistan, where stakes are higher for students and teachers 
alike because of the ‘publish or perish’ culture for teachers and the requirement of submitting 
proof of research activities as part of the degree requirements for students. 

Conclusion 

Assessment of educational outcomes in any discipline area is a complex process. Researchers 
in this field have, by and large, remained divided on the question of whether summative or 
formative assessment is more successful in driving students’ learning. The present study has 
made an important contribution to the small emerging body of literature depicting the 
experiences of transition societies like Pakistan. Its input to the larger body of knowledge on 
the assessment of teaching and learning of research is also noteworthy, for our study has 
explored two interrelated questions: to what extent assessment drives learning in a research 
course, and what the relationship is between formative assessment and student learning for 
such a course. Although our research has provided empirical evidence that, in our sample of 
students, summative assessment was the engine that drove learning, we remain cautious in 
generalizing our findings beyond the sample. Nonetheless, the study has highlighted the 
paucity of literature in the area of assessment in Pakistan and indicates the need for further 
research. 